Using Multiple Query Aspects to Build Test Collections without Human Relevance Judgments

Author

  • Miles Efron
Abstract

Collecting relevance judgments (qrels) is an especially challenging part of building an information retrieval test collection. This paper presents a novel method for creating test collections by offering a substitute for relevance judgments. Our method is based on an old idea in IR: a single information need can be represented by many query articulations. We call different articulations of a particular need query aspects. By combining the top k documents retrieved by a single system for multiple query aspects, we build judgment-free qrels whose rank ordering of IR systems correlates highly with rankings based on human relevance judgments.
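The core idea can be sketched in a few lines: pool the top-k documents retrieved for several aspects (articulations) of one information need, treat that pool as pseudo-relevant, and score systems against it. The data, system names, and choice of average precision below are illustrative assumptions, not details from the paper.

```python
# Sketch of judgment-free qrels built from multiple query aspects.
# All rankings and document IDs here are made up for illustration.
from itertools import chain


def pseudo_qrels(aspect_rankings, k=3):
    """Union of the top-k documents across all aspect rankings."""
    return set(chain.from_iterable(r[:k] for r in aspect_rankings))


def average_precision(ranking, relevant):
    """Standard AP of one ranked list against a relevance set."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Three articulations of the same hypothetical information need.
aspects = [
    ["d1", "d4", "d2", "d9"],
    ["d4", "d1", "d7", "d3"],
    ["d2", "d4", "d1", "d8"],
]
qrels = pseudo_qrels(aspects, k=2)  # {"d1", "d2", "d4"}

# Two hypothetical systems scored against the judgment-free qrels.
system_a = ["d1", "d4", "d2", "d5"]
system_b = ["d5", "d8", "d1", "d4"]
print(average_precision(system_a, qrels))  # 1.0
print(average_precision(system_a, qrels) > average_precision(system_b, qrels))
```

Repeating this over many topics yields per-system effectiveness scores whose induced ranking can then be compared (e.g., by rank correlation) against a ranking from human judgments, which is the paper's evaluation criterion.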


Similar articles

Query polyrepresentation for ranking retrieval systems without relevance judgments

Ranking information retrieval (IR) systems with respect to their effectiveness is a crucial operation during IR evaluation, as well as during data fusion. This paper offers a novel method of approaching the system ranking problem, based on the widely studied idea of polyrepresentation. The principle of polyrepresentation suggests that a single information need can be represented by many query a...


An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection

OBJECTIVES The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments used to indicate which documents are pertinent to each query when forming a biomedical test collection. METHODS The aspect query is the major concept of this research; it can represent every aspect of the original query with the same informational need. Manually gener...


Efficient Test Collection Construction via Active Learning

To create a new IR test collection at minimal cost, we must carefully select which documents merit human relevance judgments. Shared task campaigns such as NIST TREC determine this by pooling search results from many participating systems (and often interactive runs as well), thereby identifying the most likely relevant documents in a given collection. While effective, it would be preferable to...


A Comparison of Pooled and Sampled Relevance Judgments in the TREC 2006 Terabyte Track

Pooling is the most common technique used to build modern test collections. Evidence is mounting that pooling may not yield reusable test collections for very large document sets. This paper describes the approach taken in the TREC 2006 Terabyte Track: an initial shallow pool was judged to gather relevance information, which was then used to draw a random sample of further documents to judge. T...
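The pooling technique this abstract refers to is straightforward to illustrate: merge the top-k results of every participating run into one pool of documents to be judged. The runs and depth below are invented for illustration.

```python
# Minimal sketch of depth-k pooling over participating runs.
# Run contents and the pool depth are illustrative assumptions.
def depth_k_pool(runs, k):
    """Set of documents that any run ranks within its top k."""
    pool = set()
    for ranking in runs:
        pool.update(ranking[:k])
    return pool


runs = [
    ["d3", "d1", "d5"],
    ["d1", "d2", "d6"],
]
print(sorted(depth_k_pool(runs, 2)))  # ['d1', 'd2', 'd3']
```

The Terabyte Track approach described above starts from such a shallow pool and then samples additional documents to judge, rather than judging a single deep pool.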


Selecting a Subset of Queries for Acquisition of Further Relevance Judgements

Assessing the relative performance of search systems requires the use of a test collection with a pre-defined set of queries and corresponding relevance assessments. The state-of-the-art process of constructing test collections involves using a large number of queries and selecting a set of documents, submitted by a group of participating systems, to be judged per query. However, the initial set...



Publication year: 2009